Results 1 - 15 of 15
1.
Nat Commun ; 15(1): 2955, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580696

ABSTRACT

Physical unclonable functions (PUFs) based on unique tokens generated by random manufacturing processes have been proposed as an alternative to mathematical one-way algorithms. However, these tokens are not distributable, which is a disadvantage for decentralized applications. Finding unclonable, yet distributable functions would help bridge this gap and expand the applications of object-bound cryptography. Here we show that large random DNA pools with a segmented structure of alternating constant and randomly generated portions are able to calculate distinct outputs from millions of inputs in a specific and reproducible manner, in analogy to physical unclonable functions. Our experimental data with pools comprising up to >10¹⁰ unique sequences and encompassing >750 comparisons of resulting outputs demonstrate that the proposed chemical unclonable function (CUF) system is robust, distributable, and scalable. Based on this proof of concept, CUF-based anti-counterfeiting systems, non-fungible objects and decentralized multi-user authentication are conceivable.
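The challenge-response behaviour described above can be sketched in a toy model: a pool of sequences with constant flanks and random cores, where a challenge selects a subpool and the response is a digest of it. The flank sequences, pool model, and hash-based readout below are illustrative assumptions, not the paper's actual chemistry.

```python
import hashlib
import random

random.seed(0)

# Toy stand-in for a chemical unclonable function (CUF): sequences with
# constant flanks (primer sites) around a randomly synthesised 8-nt core.
FLANK5, FLANK3 = "ACGTACGT", "TTGACCAA"   # illustrative constant segments
cores = ["".join(random.choice("ACGT") for _ in range(8)) for _ in range(5000)]
sequences = [FLANK5 + core + FLANK3 for core in cores]

def cuf_response(challenge: str) -> str:
    """A challenge selects the subpool whose random core starts with it;
    the response digests that subpool (sorted for reproducibility)."""
    selected = sorted(s for s in sequences if s[8:16].startswith(challenge))
    return hashlib.sha256("".join(selected).encode()).hexdigest()

# Reproducible: the same pool gives the same answer to the same challenge.
assert cuf_response("AC") == cuf_response("AC")
# Specific: different challenges yield different responses.
assert cuf_response("AC") != cuf_response("GT")
```

Anyone holding a copy of the pool can evaluate the function, which is the sense in which a CUF is distributable while remaining unclonable at the synthesis level.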


Subjects
Algorithms, Commerce, DNA, Structure-Activity Relationship
2.
Nat Commun ; 14(1): 6026, 2023 09 27.
Article in English | MEDLINE | ID: mdl-37758710

ABSTRACT

Archiving data in synthetic DNA offers unprecedented storage density and longevity. Handling and storage introduce errors and biases into DNA-based storage systems, necessitating the use of Error Correction Coding (ECC) which comes at the cost of added redundancy. However, insufficient data on these errors and biases, as well as a lack of modeling tools, limit data-driven ECC development and experimental design. In this study, we present a comprehensive characterisation of the error sources and biases present in the most common DNA data storage workflows, including commercial DNA synthesis, PCR, decay by accelerated aging, and sequencing-by-synthesis. Using the data from 40 sequencing experiments, we build a digital twin of the DNA data storage process, capable of simulating state-of-the-art workflows and reproducing their experimental results. We showcase the digital twin's ability to replace experiments and rationalize the design of redundancy in two case studies, highlighting opportunities for tangible cost savings and data-driven ECC development.
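A minimal version of the kind of error model such a digital twin builds on can be written down directly. The per-base rates below are invented placeholders, not the values fitted from the study's 40 sequencing experiments.

```python
import random

random.seed(1)

def noisy_read(seq, p_sub=0.003, p_del=0.001, p_ins=0.001):
    """Toy per-base channel with substitutions, deletions and insertions.
    Rates are illustrative placeholders, not fitted workflow parameters."""
    out = []
    for base in seq:
        if random.random() < p_ins:
            out.append(random.choice("ACGT"))   # spurious inserted base
        if random.random() < p_del:
            continue                            # base lost entirely
        if random.random() < p_sub:
            base = random.choice([b for b in "ACGT" if b != base])
        out.append(base)
    return "".join(out)

reference = "ACGT" * 30                          # a 120-nt designed sequence
reads = [noisy_read(reference) for _ in range(1000)]
error_free = sum(r == reference for r in reads) / len(reads)
```

Simulating many reads through such a channel is what lets redundancy be dimensioned against expected error rates instead of against worst-case guesses.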


Subjects
DNA Replication, DNA, DNA/genetics, Bias, Longevity
3.
ACS Nano ; 16(11): 17552-17571, 2022 11 22.
Article in English | MEDLINE | ID: mdl-36256971

ABSTRACT

With the total amount of worldwide data skyrocketing, the global data storage demand is predicted to grow to 1.75 × 10¹⁴ GB by 2025. Traditional storage methods have difficulties keeping pace given that current storage media have a maximum density of 10³ GB/mm³. As such, data production will far exceed the capacity of currently available storage methods. The costs of maintaining and transferring data, as well as the limited lifespans and significant data losses associated with current technologies also demand advanced solutions for information storage. Nature offers a powerful alternative through the storage of information that defines living organisms in unique orders of four bases (A, T, C, G) located in molecules called deoxyribonucleic acid (DNA). DNA molecules as information carriers have many advantages over traditional storage media. Their high storage density, potentially low maintenance cost, ease of synthesis, and chemical modification make them an ideal alternative for information storage. To this end, rapid progress has been made over the past decade by exploiting user-defined DNA materials to encode information. In this review, we discuss the most recent advances of DNA-based data storage with a major focus on the challenges that remain in this promising field, including the current intrinsic low speed in data writing and reading and the high cost per byte stored. Alternatively, data storage relying on DNA nanostructures (as opposed to DNA sequence) as well as on other combinations of nanomaterials and biomolecules are proposed with promising technological and economic advantages. In summarizing the advances that have been made and underlining the challenges that remain, we provide a roadmap for the ongoing research in this rapidly growing field, which will enable the development of technological solutions to the global demand for superior storage methodologies.
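The gap the review quantifies follows from a one-line calculation using the two figures stated above: even at the maximum volumetric density of current media, the projected 2025 demand would occupy on the order of a hundred cubic metres of storage medium.

```python
demand_gb = 1.75e14            # projected global data demand by 2025 (GB)
density_gb_per_mm3 = 1e3       # stated maximum density of current media

volume_mm3 = demand_gb / density_gb_per_mm3   # 1.75e11 mm^3
volume_m3 = volume_mm3 / 1e9                  # 1 m^3 = 1e9 mm^3 -> 175 m^3
```

That 175 m³ figure assumes every cubic millimetre is usable medium at peak density; real installations, with packaging and infrastructure, would be far larger, which is the motivation for denser carriers such as DNA.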


Subjects
DNA, Information Storage and Retrieval, Sequence Analysis, DNA/methods, DNA/chemistry
4.
Commun Biol ; 5(1): 1117, 2022 10 20.
Article in English | MEDLINE | ID: mdl-36266439

ABSTRACT

Synthetic DNA has been proposed as a storage medium for digital information due to its high theoretical storage density and anticipated long storage horizons. However, under all ambient storage conditions, DNA undergoes a slow chemical decay process resulting in nicked (broken) DNA strands, and the information stored in these strands is no longer readable. In this work we design an enzymatic repair procedure, which is applicable to the DNA pool prior to readout and can partially reverse the damage. Through a chemical understanding of the decay process, an overhang at the 3' end of the damaged site is identified as obstructive to repair via the base excision-repair (BER) mechanism. The obstruction can be removed via the enzyme apurinic/apyrimidinic endonuclease I (APE1), thereby enabling repair of hydrolytically damaged DNA via Bst polymerase and Taq ligase. Simulations of damage and repair reveal the benefit of the enzymatic repair step for DNA data storage, especially when data is stored in DNA at high storage densities (=low physical redundancy) and for long time durations.
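The benefit of a repair step at readout can be illustrated with a toy Monte Carlo model: independent nicks per strand and a fixed per-nick repair probability. The parameters are invented for illustration and are not the paper's measured decay or repair kinetics.

```python
import random

random.seed(2)

def surviving_fraction(n_strands=10000, length=150, p_nick=0.01,
                       repair_eff=0.0):
    """Toy model: each backbone position nicks independently with p_nick;
    a strand is readable iff every nick was repaired (prob. repair_eff each).
    All parameters are illustrative, not experimental values."""
    ok = 0
    for _ in range(n_strands):
        nicks = sum(1 for _ in range(length) if random.random() < p_nick)
        repaired = sum(1 for _ in range(nicks) if random.random() < repair_eff)
        ok += (nicks == repaired)
    return ok / n_strands

no_repair = surviving_fraction()                  # ~exp(-1.5) of strands readable
with_repair = surviving_fraction(repair_eff=0.8)  # most nicks reversed
```

Even a partial repair efficiency sharply raises the readable fraction, which is why the gain matters most at high storage density, where little physical redundancy is available to absorb lost strands.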


Assuntos
Reparo do DNA , DNA Liase (Sítios Apurínicos ou Apirimidínicos) , DNA Liase (Sítios Apurínicos ou Apirimidínicos)/genética , DNA Liase (Sítios Apurínicos ou Apirimidínicos)/metabolismo , DNA/genética , Armazenamento e Recuperação da Informação , Desoxirribonuclease I , Ligases
5.
Nat Commun ; 11(1): 5869, 2020 11 18.
Article in English | MEDLINE | ID: mdl-33208744

ABSTRACT

The volume of securely encrypted data transmission required by today's complex networks of people, transactions and interactions increases continuously. To guarantee the security of encryption and decryption schemes for exchanging sensitive information, large volumes of true random numbers are required. Here we present a method to exploit the stochastic nature of chemistry by synthesizing DNA strands composed of random nucleotides. We compare three commercial random DNA syntheses, giving a measure of the robustness and synthesis distribution of nucleotides, and show that using DNA for random number generation we can obtain 7 million GB of randomness from one synthesis run, which can be read out using state-of-the-art sequencing technologies at rates of ca. 300 kB/s. Using the von Neumann algorithm for data compression, we remove bias introduced by human or technological sources and assess randomness using NIST's statistical test suite.
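The von Neumann debiasing step mentioned above is simple to state in code: consume the bit stream in pairs, emit the first bit of each unequal pair, and discard equal pairs. The DNA-to-bit mapping shown is one possible convention, not necessarily the one used in the paper.

```python
def von_neumann(bits):
    """Von Neumann debiasing: for each consecutive pair, 01 -> 0 and
    10 -> 1; discard 00 and 11. Output is unbiased if pairs are i.i.d."""
    return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]

# Map DNA bases to bits first; A/C -> 0, G/T -> 1 is one possible convention.
base_to_bit = {"A": 0, "C": 0, "G": 1, "T": 1}
seq = "AAGTGCATTG"
bits = [base_to_bit[b] for b in seq]     # [0,0,1,1,1,0,0,1,1,1]
unbiased = von_neumann(bits)             # pairs 00, 11, 10, 01, 11 -> [1, 0]
```

The extractor trades throughput for quality: a stream with bias p keeps only 2p(1-p) of its pairs, which is part of why the raw sequencing rate (ca. 300 kB/s) exceeds the rate of debiased output.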


Assuntos
DNA/síntese química , Algoritmos , Sequência de Bases , DNA/genética , Humanos , Análise de Sequência de DNA
6.
Nat Commun ; 11(1): 5345, 2020 10 22.
Article in English | MEDLINE | ID: mdl-33093494

ABSTRACT

Due to its longevity and enormous information density, DNA is an attractive medium for archival storage. The current bottleneck of DNA data storage systems, in both cost and speed, is synthesis. The key idea pursued in this work for breaking this bottleneck is to move beyond the low-error and expensive synthesis employed almost exclusively in today's systems, towards cheaper, potentially faster, but high-error synthesis technologies. Here, we demonstrate a DNA storage system that relies on massively parallel light-directed synthesis, which is considerably cheaper than conventional solid-phase synthesis. However, this technology has a high sequence error rate when optimized for speed. We demonstrate that reliable storage of information is possible even in this high-error regime, by developing a pipeline of algorithms for encoding and reconstruction of the information. In our experiments, we store a file containing sheet music by Mozart, and show perfect data recovery from low-synthesis-fidelity DNA.
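One ingredient of reconstruction from high-error reads can be sketched as a position-wise majority vote over reads of the same designed sequence. This is a deliberately simplified stand-in: the paper's actual pipeline also has to cluster reads by origin and handle insertions and deletions, which this toy ignores.

```python
from collections import Counter

def consensus(reads):
    """Position-wise majority vote over equal-length reads of one design.
    A simplified stand-in for a full reconstruction pipeline (no indels)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))

# Five noisy copies of the same 6-nt design, each with at most one error:
reads = ["ACGTAC", "ACGAAC", "ACGTAC", "TCGTAC", "ACGTCC"]
recovered = consensus(reads)   # substitutions are outvoted column by column
```

The point of the sketch is that per-read error rates can be high as long as errors at any given position remain a minority across reads, which is what makes cheap high-error synthesis usable.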


Assuntos
Técnicas de Química Sintética/métodos , DNA/síntese química , Armazenamento e Recuperação da Informação/métodos , Algoritmos , Sequência de Bases , DNA/química , DNA/genética , Biblioteca Gênica , Luz , Método de Monte Carlo , Análise de Sequência com Séries de Oligonucleotídeos/métodos , Processos Fotoquímicos , Análise de Sequência de DNA
7.
Angew Chem Int Ed Engl ; 59(22): 8476-8480, 2020 05 25.
Article in English | MEDLINE | ID: mdl-32083389

ABSTRACT

Today, we can read human genomes and store digital data robustly in synthetic DNA. Herein, we report a strategy that intertwines these two technologies to enable the secure storage of valuable information in synthetic DNA, protected with personalized keys. We show that genetic short tandem repeats (STRs) contain sufficient entropy to generate strong encryption keys, and that only one technology, DNA sequencing, is required to simultaneously read the key and the data. Using this approach, we experimentally generated strong 80-bit keys from human DNA and used such a key to encrypt 17 kB of digital information stored in synthetic DNA. Finally, the decrypted information was recovered perfectly from a single massively parallel sequencing run.
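The core idea, deriving key material from STR repeat counts, can be sketched as follows. The locus names, repeat counts, and the hash-based derivation are illustrative assumptions for the sketch, not the published key-generation procedure.

```python
import hashlib

# Hypothetical STR profile: (allele_1, allele_2) repeat counts at a few loci.
# Locus names are real forensic STR markers; the counts are made up.
str_profile = {"D3S1358": (15, 17), "vWA": (14, 16), "FGA": (21, 24),
               "D8S1179": (12, 13), "D21S11": (29, 30)}

# Serialise deterministically, then hash down to 80 bits of key material,
# mirroring the idea (not the exact method) of an STR-derived key.
material = ",".join(f"{locus}:{a}-{b}"
                    for locus, (a, b) in sorted(str_profile.items()))
key_80bit = hashlib.sha256(material.encode()).digest()[:10]  # 10 B = 80 bits
```

Because the same genotype always serialises to the same string, re-sequencing the donor regenerates the identical key, which is what lets one sequencing run read both the key and the encrypted payload.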


Assuntos
Segurança Computacional , DNA/genética , Genômica , Armazenamento e Recuperação da Informação/métodos , Sequenciamento de Nucleotídeos em Larga Escala , Humanos , Repetições de Microssatélites/genética
8.
Nat Protoc ; 15(1): 86-101, 2020 01.
Article in English | MEDLINE | ID: mdl-31784718

ABSTRACT

Because of its longevity and enormous information density, DNA is considered a promising data storage medium. In this work, we provide instructions for archiving digital information in the form of DNA and for subsequently retrieving it from the DNA. In principle, information can be represented in DNA by simply mapping the digital information to DNA and synthesizing it. However, imperfections in synthesis, sequencing, storage and handling of the DNA induce errors within the molecules, making error-free information storage challenging. The procedure discussed here enables error-free storage by protecting the information using error-correcting codes. Specifically, in this protocol, we provide the technical details and precise instructions for translating digital information to DNA sequences, physically handling the biomolecules, storing them and subsequently re-obtaining the information by sequencing the DNA. Along with the protocol, we provide computer code that automatically encodes digital information to DNA sequences and decodes the information back from DNA to a digital file. The required software is provided in a GitHub repository. The protocol relies on commercial DNA synthesis and DNA sequencing via Illumina dye sequencing, and requires 1-2 h of preparation time, 1/2 d for sequencing preparation and 2-4 h for data analysis. This protocol focuses on storage scales of ~100 kB to 15 MB, offering an ideal starting point for small experiments. It can be augmented to enable higher data volumes and random access to the data, and can also accommodate future sequencing and synthesis technologies by changing the parameters of the encoder/decoder to account for the corresponding error rates.
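The "simply mapping the digital information to DNA" step can be made concrete with a two-bits-per-base codec. This is the naive base layer only; the protocol's real encoder additionally adds error-correcting redundancy and sequence constraints on top of such a mapping.

```python
BASES = "ACGT"

def encode(data: bytes) -> str:
    """Map each byte to four bases, two bits per base, MSB first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))

def decode(dna: str) -> bytes:
    """Invert encode(): pack each run of four bases back into one byte."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

# Round trip: 2 bits/base means a byte always costs exactly 4 nucleotides.
assert decode(encode(b"Hello, DNA!")) == b"Hello, DNA!"
```

Practical schemes deviate from this direct mapping to avoid homopolymer runs and extreme GC content, and to leave room for the error-correcting code; the 2 bits/base here is the unconstrained upper bound.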


Assuntos
DNA/genética , Análise de Sequência de DNA/métodos , Sequência de Bases , DNA/química , Modelos Moleculares , Conformação de Ácido Nucleico
9.
Sci Rep ; 9(1): 9663, 2019 07 04.
Article in English | MEDLINE | ID: mdl-31273225

ABSTRACT

Owing to its longevity and enormous information density, DNA, the molecule encoding biological information, has emerged as a promising archival storage medium. However, due to technological constraints, data can only be written onto many short DNA molecules that are stored in an unordered way, and can only be read by sampling from this DNA pool. Moreover, imperfections in writing (synthesis), reading (sequencing), storage, and handling of the DNA, in particular amplification via PCR, lead to a loss of DNA molecules and induce errors within the molecules. In order to design DNA storage systems, a qualitative and quantitative understanding of the errors and the loss of molecules is crucial. In this paper, we characterize those error probabilities by analyzing data from our own experiments as well as from experiments of two different groups. We find that errors within molecules are mainly due to synthesis and sequencing, while imperfections in handling and storage lead to a significant loss of sequences. The aim of our study is to help guide the design of future DNA data storage systems by providing a quantitative and qualitative understanding of the DNA data storage channel.
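The loss of molecules described above has a simple sampling consequence: reading an unordered pool at finite coverage misses a roughly Poisson-distributed tail of sequences. A toy simulation with an invented pool size and uniform sampling (real pools are skewed by amplification bias) illustrates the scale.

```python
import math
import random

random.seed(3)

n_seqs = 10000                 # hypothetical number of designed sequences
coverage = 5                   # mean sequencing reads per designed sequence

# Sample reads uniformly at random from the pool (no amplification bias).
reads = [random.randrange(n_seqs) for _ in range(n_seqs * coverage)]
missing = n_seqs - len(set(reads))          # sequences never observed

# Poisson approximation: P(a sequence gets zero reads) = exp(-coverage).
expected_missing = n_seqs * math.exp(-coverage)   # ~67 of 10,000 at 5x
```

Because handling and storage skew the copy-number distribution, real dropout exceeds this uniform-sampling baseline, which is why the paper treats sequence loss separately from within-molecule errors.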


Subjects
Algorithms, DNA/analysis, DNA/genetics, Diagnostic Tests, Routine/standards, Information Storage and Retrieval/standards, Sequence Analysis, DNA/methods, Specimen Handling/standards, High-Throughput Nucleotide Sequencing/methods, Humans, Research Design/standards
11.
ACS Nano ; 9(10): 9564-72, 2015 Oct 27.
Article in English | MEDLINE | ID: mdl-26258812

ABSTRACT

The concentrations of nanoparticles in colloidal dispersions are usually measured and reported as mass concentrations (e.g. mg/mL), and number concentrations can only be obtained by making assumptions about nanoparticle size and morphology. Additionally, traditional nanoparticle concentration measures are not very sensitive: only the presence or absence of millions or billions of particles occurring together can be detected. Here, we describe a method that not only intrinsically yields number concentrations, but is also sensitive enough to count individual nanoparticles, one by one. To make this possible, the sensitivity of the polymerase chain reaction (PCR) was combined with a binary (0/1, yes/no) measurement arrangement, binomial statistics, and DNA-comprising monodisperse silica nanoparticles. With this method, individual tagged particles in the range of 60-250 nm could be detected and counted in drinking water in absolute numbers, using a standard qPCR device within 1.5 h of measurement time. For validation, the method was compared with single-particle inductively coupled plasma mass spectrometry (sp-ICPMS).
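The binomial/Poisson logic behind the binary readout is compact: if particles distribute over aliquots at random, the fraction of negative (no-signal) wells alone estimates the mean particle count per well, exactly as in digital PCR. The well counts below are hypothetical.

```python
import math

# Binary (positive/negative) PCR readout across aliquots. Under a Poisson
# distribution of particles over wells, P(well negative) = exp(-lambda),
# so lambda is recovered from the observed negative fraction alone.
wells = 96
negative = 35                                  # hypothetical no-signal wells

mean_per_well = -math.log(negative / wells)    # estimated particles per well
total_particles = mean_per_well * wells        # absolute number in the sample
```

The estimate needs no calibration against particle mass or size, which is why the readout is intrinsically a number concentration; precision is set by how close the negative fraction is to 0 or 1.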


Assuntos
DNA/análise , Água Potável/análise , Nanopartículas/análise , Reação em Cadeia da Polimerase/instrumentação , Dióxido de Silício/análise , Monitoramento Ambiental/instrumentação , Desenho de Equipamento , Nanopartículas/ultraestrutura , Tamanho da Partícula , Transição de Fase
12.
Angew Chem Int Ed Engl ; 54(8): 2552-5, 2015 Feb 16.
Article in English | MEDLINE | ID: mdl-25650567

ABSTRACT

Information, such as text printed on paper or images projected onto microfilm, can survive for over 500 years. However, the storage of digital information for time frames exceeding 50 years is challenging. Here we show that digital information can be stored on DNA and recovered without errors for considerably longer time frames. To allow for the perfect recovery of the information, we encapsulate the DNA in an inorganic matrix, and employ error-correcting codes to correct storage-related errors. Specifically, we translated 83 kB of information to 4991 DNA segments, each 158 nucleotides long, which were encapsulated in silica. Accelerated aging experiments were performed to measure DNA decay kinetics, which show that data can be archived on DNA for millennia under a wide range of conditions. The original information could be recovered error free, even after treating the DNA in silica at 70 °C for one week. This is thermally equivalent to storing information on DNA in central Europe for 2000 years.
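The thermal-equivalence claim follows from Arrhenius scaling of the decay rate constant. The sketch below assumes an activation energy of about 155 kJ/mol for DNA decay and a mean storage temperature of 9.4 °C for central Europe; both numbers are assumptions of this sketch, chosen so that one week at 70 °C maps to roughly two millennia of ambient storage.

```python
import math

R = 8.314                       # gas constant, J/(mol·K)
Ea = 155e3                      # assumed activation energy of decay, J/mol
T_aged = 70 + 273.15            # accelerated-aging temperature, K
T_store = 9.4 + 273.15          # assumed mean storage temperature, K

# Arrhenius: k ∝ exp(-Ea / (R·T)), so the acceleration factor between the
# two temperatures is the ratio of the rate constants.
accel = math.exp(-Ea / R * (1 / T_aged - 1 / T_store))
years_equiv = accel * 7 / 365.25        # one week of aging, in storage-years
```

The strong exponential dependence on temperature is also why the paper can claim millennia-scale stability across a wide range of realistic conditions: modest cooling multiplies lifetime dramatically.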


Assuntos
DNA/química , Armazenamento e Recuperação da Informação/métodos , Dióxido de Silício/química , Algoritmos , Armazenamento e Recuperação da Informação/normas
13.
PLoS One ; 8(5): e64371, 2013.
Article in English | MEDLINE | ID: mdl-23741321

ABSTRACT

Nested canalizing Boolean functions (NCFs) play an important role in biologically motivated regulatory networks and in signal processing, in particular in describing stack filters. It has been conjectured that NCFs have a stabilizing effect on network dynamics. It is well known that the average sensitivity plays a central role in the stability of (random) Boolean networks. Here we provide a tight upper bound on the average sensitivity of NCFs as a function of the number of relevant input variables. As conjectured in the literature, this bound is smaller than 4/3. This shows that a large class of functions appearing in biological networks has low average sensitivity, which is even close to a tight lower bound.
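The bound can be checked numerically on small examples. The exhaustive computation below evaluates the average sensitivity (the expected number of inputs whose flip changes the output, under uniform inputs) for a 3-variable nested canalizing function and verifies it against the 4/3 bound; the example function is my choice, not one from the paper.

```python
from itertools import product

def avg_sensitivity(f, n):
    """Average sensitivity of f: {0,1}^n -> {0,1}: the expected number of
    coordinates whose flip changes f, for x uniform on {0,1}^n."""
    total = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1
            total += f(x) != f(tuple(y))
    return total / 2 ** n

# A nested canalizing function on 3 variables: x1 OR (x2 AND x3).
ncf = lambda x: x[0] or (x[1] and x[2])
s = avg_sensitivity(ncf, 3)     # influences 6/8 + 2/8 + 2/8 = 1.25
assert s <= 4 / 3
```

For contrast, the 3-input parity function has average sensitivity 3, so the sub-4/3 guarantee is genuinely restrictive and is what underlies the stability claim.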


Subjects
Fourier Analysis, Gene Regulatory Networks, Models, Statistical, Computer Simulation
14.
EURASIP J Bioinform Syst Biol ; 2013(1): 6, 2013 May 04.
Article in English | MEDLINE | ID: mdl-23642003

ABSTRACT

Consider a large Boolean network with a feed-forward structure. Given a probability distribution on the inputs, can one find, possibly small, collections of input nodes that determine the states of most other nodes in the network? To answer this question, a notion that quantifies the determinative power of an input over the states of the nodes in the network is needed. We argue that the mutual information (MI) between a given subset of the inputs X={X1,...,Xn} of some node i and its associated function fi(X) quantifies the determinative power of this set of inputs over node i. We compare the determinative power of a set of inputs to the sensitivity to perturbations of these inputs, and find that, perhaps surprisingly, an input that has large sensitivity to perturbations does not necessarily have large determinative power. However, for unate functions, which play an important role in genetic regulatory networks, we find a direct relation between MI and sensitivity to perturbations. As an application of our results, we analyze the large-scale regulatory network of Escherichia coli. We identify the most determinative nodes and show that a small subset of those reduces the overall uncertainty of the network state significantly. Furthermore, the network is found to be tolerant to perturbations of its inputs.
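The MI-based notion of determinative power is directly computable for small functions. The sketch below computes I(X_i ; f(X)) for uniform inputs; the 3-input AND example is my illustration, not a function from the E. coli analysis.

```python
from collections import Counter
from itertools import product
from math import log2

def mi_input_output(f, i, n):
    """Mutual information I(X_i ; f(X)) in bits, for X uniform on {0,1}^n:
    computed from the exact joint distribution of (X_i, f(X))."""
    joint = Counter((x[i], f(x)) for x in product((0, 1), repeat=n))
    total = 2 ** n
    px, py = Counter(), Counter()
    for (a, b), c in joint.items():
        px[a] += c
        py[b] += c
    return sum(c / total * log2(c * total / (px[a] * py[b]))
               for (a, b), c in joint.items())

# For f = x0 AND x1 AND x2, each single input is only weakly determinative:
# knowing x0 = 0 fixes f, but x0 = 1 leaves f mostly undetermined.
f = lambda x: x[0] & x[1] & x[2]
mi = mi_input_output(f, 0, 3)    # well below H(f) ≈ 0.544 bits
```

Comparing such MI values across inputs is exactly the ranking the paper uses to pick the small subset of nodes that removes most of the network-state uncertainty.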

15.
EURASIP J Bioinform Syst Biol ; 2011: 6, 2011 Oct 11.
Article in English | MEDLINE | ID: mdl-21989141

ABSTRACT

Boolean models of regulatory networks are assumed to be tolerant to perturbations. That qualitatively implies that each function can only depend on a few nodes. Biologically motivated constraints further show that functions found in Boolean regulatory networks belong to certain classes of functions, for example, the unate functions. It turns out that these classes have specific properties in the Fourier domain. That motivates us to study the problem of detecting controlling nodes in classes of Boolean networks using spectral techniques. We consider networks with unbalanced functions and functions of an average sensitivity less than (2/3)k, where k is the number of controlling variables for a function. Further, we consider the class of 1-low networks which include unate networks, linear threshold networks, and networks with nested canalyzing functions. We show that the application of spectral learning algorithms leads to both better time and sample complexity for the detection of controlling nodes compared with algorithms based on exhaustive search. For a particular algorithm, we state analytical upper bounds on the number of samples needed to find the controlling nodes of the Boolean functions. Further, improved algorithms for detecting controlling nodes in large-scale unate networks are given and numerically studied.
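The Fourier-domain properties these spectral techniques exploit can be made concrete by computing the Walsh-Hadamard coefficients of a small Boolean function. The majority example below, a linear threshold function of the kind the abstract covers, shows its spectrum concentrated on low-degree character sets; the code is an illustration, not the paper's learning algorithm.

```python
from itertools import product

def fourier_coeffs(f, n):
    """Fourier (Walsh-Hadamard) coefficients of f: {0,1}^n -> {0,1} in the
    ±1 convention: hat(s) = E_x[(-1)^f(x) * (-1)^(s·x)]. Spectral learning
    of controlling nodes relies on large coefficients at small sets s."""
    coeffs = {}
    for s in product((0, 1), repeat=n):
        total = 0
        for x in product((0, 1), repeat=n):
            chi = (-1) ** sum(a & b for a, b in zip(s, x))
            total += (-1) ** f(x) * chi
        coeffs[s] = total / 2 ** n
    return coeffs

# Majority on 3 inputs: all weight sits on the three singleton sets (0.5
# each) and the full set (-0.5); the controlling variables are exactly the
# inputs with nonzero degree-1 coefficients.
maj = lambda x: int(sum(x) >= 2)
coeffs = fourier_coeffs(maj, 3)
```

Estimating the degree-1 coefficients from samples, rather than evaluating all 2^n inputs, is what gives the spectral approach its time and sample advantage over exhaustive search.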
